Application of Distance Metric Learning to Automated Malware Detection

Abstract

Distance metric learning aims to find the most appropriate distance metric parameters to improve similarity-based models such as k-Nearest Neighbors or k-Means. In this paper, we apply distance metric learning to the problem of malware detection. We focus on two tasks: (1) to classify malware and benign files with a minimal error rate, and (2) to detect as much malware as possible while maintaining a low false positive rate. We propose a malware detection system using Particle Swarm Optimization that finds the feature weights to optimize the similarity measure. We compare the performance of the approach with three state-of-the-art distance metric learning techniques. Metrics trained in this way lead to significant improvements in classification. We conducted and evaluated experiments with more than 150,000 Windows-based malware and benign samples. Features consisted of metadata contained in the headers of executable files in the portable executable file format. Our experimental results show that our malware detection system based on distance metric learning achieves a 1.09 % error rate at a 0.74 % false positive rate (FPR) and outperforms all machine learning algorithms considered in the experiment. Considering the second task, related to keeping a low FPR, we achieved a 1.15 % error rate at only 0.13 % FPR.
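The idea of learning feature weights with Particle Swarm Optimization can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the toy two-feature data, the leave-one-out k-NN error used as fitness, and the PSO constants (inertia 0.7, acceleration 1.5, weights clipped to [0, 1]) are all assumptions chosen for illustration.

```python
import math
import random

def weighted_distance(a, b, w):
    """Weighted Euclidean distance; w holds one non-negative weight per feature."""
    return math.sqrt(sum(wi * (ai - bi) ** 2 for ai, bi, wi in zip(a, b, w)))

def loo_error(X, y, w, k=3):
    """Leave-one-out error rate of a k-NN classifier under the weighted distance."""
    errors = 0
    for i, xi in enumerate(X):
        neighbours = sorted(
            (weighted_distance(xi, xj, w), y[j]) for j, xj in enumerate(X) if j != i
        )[:k]
        votes = sum(label for _, label in neighbours)  # labels: 1 = malware, 0 = benign
        predicted = 1 if 2 * votes > k else 0
        errors += predicted != y[i]
    return errors / len(X)

def error_and_fpr(y_true, y_pred):
    """Return (error rate, false positive rate), with benign = 0 and malware = 1."""
    wrong = sum(t != p for t, p in zip(y_true, y_pred))
    false_pos = sum(t == 0 and p == 1 for t, p in zip(y_true, y_pred))
    return wrong / len(y_true), false_pos / y_true.count(0)

def pso_weights(X, y, n_particles=10, n_iters=20, seed=0):
    """Search for feature weights in [0, 1] that minimise loo_error (hypothetical setup)."""
    rng = random.Random(seed)
    dim = len(X[0])
    pos = [[rng.random() for _ in range(dim)] for _ in range(n_particles)]
    pos[0] = [1.0] * dim  # one particle starts at the unweighted baseline
    vel = [[0.0] * dim for _ in range(n_particles)]
    best_pos = [p[:] for p in pos]
    best_fit = [loo_error(X, y, p) for p in pos]
    g = min(range(n_particles), key=best_fit.__getitem__)
    g_pos, g_fit = best_pos[g][:], best_fit[g]
    for _ in range(n_iters):
        for i in range(n_particles):
            for d in range(dim):
                r1, r2 = rng.random(), rng.random()
                vel[i][d] = (0.7 * vel[i][d]
                             + 1.5 * r1 * (best_pos[i][d] - pos[i][d])
                             + 1.5 * r2 * (g_pos[d] - pos[i][d]))
                pos[i][d] = min(1.0, max(0.0, pos[i][d] + vel[i][d]))
            fit = loo_error(X, y, pos[i])
            if fit < best_fit[i]:
                best_pos[i], best_fit[i] = pos[i][:], fit
                if fit < g_fit:
                    g_pos, g_fit = pos[i][:], fit
    return g_pos, g_fit

def make_toy_data(n=40, seed=1):
    """Toy data: feature 0 is informative, feature 1 is large-scale noise."""
    rng = random.Random(seed)
    X = [[(i % 2) + rng.gauss(0, 0.2), rng.gauss(0, 5.0)] for i in range(n)]
    y = [i % 2 for i in range(n)]
    return X, y

if __name__ == "__main__":
    X, y = make_toy_data()
    weights, fitness = pso_weights(X, y)
    print("baseline error:", loo_error(X, y, [1.0, 1.0]))
    print("learned weights:", weights, "error:", fitness)
```

Because one particle starts at the all-ones weight vector, the returned fitness can never be worse than the unweighted baseline on the data used for fitness. A fitness that also penalises false positives, as in the second task, could be built from `error_and_fpr`, but the form of such a penalty is not given in the abstract.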

Similar resources

Distance Metric Learning for Conditional Anomaly Detection

Anomaly detection methods can be very useful in identifying unusual or interesting patterns in data. A recently proposed conditional anomaly detection framework extends anomaly detection to the problem of identifying anomalous patterns on a subset of attributes in the data. The anomaly always depends (is conditioned) on the value of remaining attributes. The work presented in this paper focuses...

Distance Metric Learning with Application to Clustering with Side-Information

Many algorithms rely critically on being given a good metric over their inputs. For instance, data can often be clustered in many “plausible” ways, and if a clustering algorithm such as K-means initially fails to find one that is meaningful to a user, the only recourse may be for the user to manually tweak the metric until sufficiently good clusters are found. For these and other applications r...

Bayesian Distance Metric Learning

This thesis explores the use of Bayesian distance metric learning (Bayes-dml) for the task of speaker verification using the i-vector feature representation. We propose a framework that explores the distance constraints between i-vector pairs from the same speaker and different speakers. With an approximation of the distance metric as a weighted covariance matrix of the top eigenvectors from th...

Distance Metric Learning Revisited

The success of many machine learning algorithms (e.g. the nearest neighborhood classification and k-means clustering) depends on the representation of the data as elements in a metric space. Learning an appropriate distance metric from data is usually superior to the default Euclidean distance. In this paper, we revisit the original model proposed by Xing et al. [24] and propose a general formu...

Hamming Distance Metric Learning

Motivated by large-scale multimedia applications we propose to learn mappings from high-dimensional data to binary codes that preserve semantic similarity. Binary codes are well suited to large-scale applications as they are storage efficient and permit exact sub-linear kNN search. The framework is applicable to broad families of mappings, and uses a flexible form of triplet ranking loss. We ov...

Journal

Journal title: IEEE Access

Year: 2021

ISSN: 2169-3536

DOI: https://doi.org/10.1109/access.2021.3094064